#THIS IS STILL A WORK IN PROGRESS
A choropleth map is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map, such as population density or per-capita income. In this exercise we are going to produce the map below.
#Insert example of the finished choropleth map we will build
Boundaries are the shapes that make up the areas you wish to visualise. The boundaries could be the outlines of countries, postcode areas, voting areas or any other geographical boundary. Explanation of polygon shapefiles and where to get them from.
If your analysis is focused on the UK then the Office for National Statistics (ONS) Open Geography Portal is a fantastic resource. It has a wide range of geographies for the UK.
The Hierarchical Representation of UK Statistical Geographies gives a great overview of the geographies available.
You can find the boundaries in the Boundaries menu in the portal.
For this project we are going to use the English Regions. As we are only visualising the data, we will use the ultra generalised version because the file is much smaller.
You can download the boundaries as a file or load them directly into R from the ONS website.
To load them directly from the ONS website you need a GeoJSON link. To get the link, click on the API drop-down and copy the GeoJSON link.
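If you chose to download the boundaries as a file instead, the same function can be pointed at a local path. The sketch below is only an illustration. It writes a tiny, made-up GeoJSON file (a single square with the hypothetical code X00000001) to a temporary location, standing in for the file you would download from the portal. It assumes the geojsonio and sp packages are installed.

```r
library(sp)

# A made-up GeoJSON document standing in for a file downloaded from the portal
toy_geojson <- '{"type":"FeatureCollection","features":[{"type":"Feature","properties":{"rgn18cd":"X00000001","rgn18nm":"Toy Region"},"geometry":{"type":"Polygon","coordinates":[[[0,0],[0,1],[1,1],[1,0],[0,0]]]}}]}'

# Write it to a temporary file so there is a local path to read from
toy_path <- tempfile(fileext = ".geojson")
writeLines(toy_geojson, toy_path)

# Same call as for a URL, but pointed at the local file
toy_regions <- geojsonio::geojson_read(toy_path, what = "sp")

class(toy_regions)
toy_regions@data
```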
Here we load the data into a SpatialPolygonsDataFrame. Using the glimpse function on the data slot shows us there are nine polygons, one for each of the English regions. Each region has a region code stored in the rgn18cd column.
#Load the packages used in this tutorial
library(dplyr)
library(leaflet)
library(DT)
#Read the data from the ONS website using the link from the site
uk_regions <- geojsonio::geojson_read("https://opendata.arcgis.com/datasets/bafeb380d7e34f04a3cdf1628752d5c3_0.geojson", what = "sp")
#Get a glimpse of the data
glimpse(uk_regions@data)
## Observations: 9
## Variables: 9
## $ objectid <int> 1, 2, 3, 4, 5, 6, 7, 8, 9
## $ rgn18cd <fct> E12000001, E12000002, E12000003, E12000004, E12...
## $ rgn18nm <fct> North East, North West, Yorkshire and The Humbe...
## $ bng_e <int> 417313, 350015, 446903, 477660, 386294, 571074,...
## $ bng_n <int> 600358, 506280, 448736, 322635, 295477, 263229,...
## $ long <dbl> -1.728900, -2.772370, -1.287120, -0.849670, -2....
## $ lat <dbl> 55.29703, 54.44945, 53.93264, 52.79572, 52.5569...
## $ st_areashape <dbl> 8607315188, 14178268574, 15427653555, 156593394...
## $ st_lengthshape <dbl> 647753.7, 1080027.2, 874559.8, 895370.0, 774627...
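To make the structure of a SpatialPolygonsDataFrame more concrete, here is a minimal sketch that builds one by hand with the sp package. The two squares, their IDs and the X-prefixed codes are all invented for illustration. The point is only that the object pairs a list of polygons with a data frame holding one attribute row per polygon.

```r
library(sp)

# Clockwise unit square with its lower-left corner at (x0, y0)
square <- function(x0, y0) {
  cbind(c(x0, x0, x0 + 1, x0 + 1, x0),
        c(y0, y0 + 1, y0 + 1, y0, y0))
}

# Two hypothetical areas standing in for real region boundaries
polys <- SpatialPolygons(list(
  Polygons(list(Polygon(square(0, 0))), ID = "a"),
  Polygons(list(Polygon(square(1, 0))), ID = "b")
))

# The attribute table: row names must match the polygon IDs
attrs <- data.frame(
  rgn18cd = c("X00000001", "X00000002"),
  rgn18nm = c("Toy West", "Toy East"),
  row.names = c("a", "b"),
  stringsAsFactors = FALSE
)

toy_regions <- SpatialPolygonsDataFrame(polys, data = attrs)

length(toy_regions@polygons)  # number of polygons
nrow(toy_regions@data)        # number of attribute rows
names(toy_regions@data)       # attribute columns
```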
If we added these polygons to a map now we would see the nine regions of England.
#Create a map and add the polygons
m <- leaflet(uk_regions) %>%
  addPolygons()
#Display the map
m
For this project we are going to use the Annual data on Civil Service employment in the UK. I have taken the data from table 13 to get the number of permanently employed civil servants in English regions.
#Read the csv
civies <- read.csv("data/perm_civil_servants.csv")
#Show the data in a datatable
datatable(civies)
We are going to combine this data with the ONS’s estimated population data for the regions. I have taken the data from table MYE2 and put it in the file pop_est_2017.csv.
We will now load this.
#Read the csv
population <- read.csv("data/pop_est_2017.csv")
#Show the data in a datatable
datatable(population)
We will now join the civies data to the population data.
#Join the civies data with the population data
merged <- left_join(x = civies, y = population, by = c("rgn18cd"))
#Display the result
datatable(merged)
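A left join silently leaves NA values wherever a region code has no match, so it can be worth checking for unmatched codes. The sketch below uses dplyr's anti_join on two small stand-in data frames. The column names follow the ones used in this tutorial, but the figures are invented.

```r
library(dplyr)

# Stand-in data with invented figures; the real files have one row per region
civies_toy <- data.frame(
  rgn18cd = c("E12000001", "E12000002", "E12000003"),
  total = c(100, 200, 300),
  stringsAsFactors = FALSE
)
population_toy <- data.frame(
  rgn18cd = c("E12000001", "E12000002"),
  pop_est_2017 = c(1000, 4000),
  stringsAsFactors = FALSE
)

# Rows of civies_toy whose region code has no partner in population_toy
unmatched <- anti_join(civies_toy, population_toy, by = "rgn18cd")
unmatched$rgn18cd
```

An empty result means every region found a match and the joined table will have no gaps in the population column.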
#Add a new column: permanent civil servants as a percentage of the estimated population
merged <- mutate(merged, civi_percent = round((total / pop_est_2017)*100, 2))
#Display the result
datatable(merged)
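As a quick check of the arithmetic, the same calculation can be run on a couple of invented rows. The numbers below are not real figures; they are chosen so the result is easy to verify by hand.

```r
library(dplyr)

# Invented values, chosen to be easy to check by hand
toy <- data.frame(
  total = c(1234, 500),
  pop_est_2017 = c(100000, 200000)
)

# Same formula as above: share of the population, as a percentage, to 2 decimal places
toy <- mutate(toy, civi_percent = round((total / pop_est_2017) * 100, 2))
toy$civi_percent
```

The first row works out as 1234 / 100000 * 100 = 1.234, which rounds to 1.23.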
We can now join the merged dataset onto the data frame stored in the data slot of our SpatialPolygonsDataFrame.
#Join the uk_regions@data to the dataset containing population and civil servant values
uk_regions@data <- uk_regions@data %>%
left_join(merged, by = c("rgn18cd"))
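Each polygon is paired with its attribute row by position, so the join must not reorder, drop or duplicate rows. left_join keeps the rows of the left-hand table in their original order, but a duplicated code in the right-hand table would add rows. A small sanity check, sketched here on stand-in data frames with invented values, might look like this:

```r
library(dplyr)

# Stand-in for uk_regions@data, deliberately in a different order from the lookup table
region_attrs <- data.frame(
  rgn18cd = c("E12000003", "E12000001", "E12000002"),
  stringsAsFactors = FALSE
)
# Stand-in for the merged table (invented percentages)
merged_toy <- data.frame(
  rgn18cd = c("E12000001", "E12000002", "E12000003"),
  civi_percent = c(0.5, 0.7, 0.9),
  stringsAsFactors = FALSE
)

codes_before <- region_attrs$rgn18cd
joined <- left_join(region_attrs, merged_toy, by = "rgn18cd")

# The join should keep the same rows in the same order, with no missing values
stopifnot(identical(joined$rgn18cd, codes_before))
stopifnot(!anyNA(joined$civi_percent))
joined$civi_percent
```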
Now we are ready to visualise the data by colouring each polygon.
We will start by creating a colour palette from the ColorBrewer "Blues" scheme.
# Create a continuous palette function
pal <- colorNumeric(palette = "Blues", domain = uk_regions@data$civi_percent)
#Code to colour the polygons based on the value
m <- leaflet(uk_regions) %>%
  addPolygons(
    stroke = FALSE,
    smoothFactor = 0.2,
    fillOpacity = 1,
    color = ~pal(civi_percent),
    highlightOptions = highlightOptions(color = "white", weight = 2, bringToFront = TRUE)
  )
#Display the map
m
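It may help to see what the palette object actually is: colorNumeric returns a function that turns numbers into hex colour strings. The sketch below uses invented percentages. It also shows colorBin, which is a swapped-in alternative not used in this tutorial, for readers who prefer a small number of discrete colour classes. The break points are arbitrary.

```r
library(leaflet)

# Invented percentages standing in for civi_percent
values <- c(0.4, 0.8, 1.2)

# Continuous palette: each value is placed along the "Blues" ramp
pal_toy <- colorNumeric(palette = "Blues", domain = values)
cols <- pal_toy(values)
cols

# Binned alternative: values falling in the same interval share a colour
pal_binned <- colorBin(palette = "Blues", domain = values, bins = c(0, 0.5, 1, 1.5))
binned <- pal_binned(values)
binned
```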
Next we add a legend to the map and apply other finishing touches.
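As a rough sketch of how a legend could be added, leaflet's addLegend takes the same palette function and values that were used to colour the polygons. The example below is self-contained and runs on two invented squares rather than the real regions. The names toy_regions, pal_toy and m_toy, the legend title and the hover labels are all assumptions made for illustration.

```r
library(sp)
library(leaflet)

# Clockwise unit square with its lower-left corner at (x0, y0)
square <- function(x0, y0) {
  cbind(c(x0, x0, x0 + 1, x0 + 1, x0),
        c(y0, y0 + 1, y0 + 1, y0, y0))
}

# Two made-up areas with invented percentages
toy_regions <- SpatialPolygonsDataFrame(
  SpatialPolygons(list(
    Polygons(list(Polygon(square(-2, 52))), ID = "a"),
    Polygons(list(Polygon(square(-1, 52))), ID = "b")
  )),
  data = data.frame(
    rgn18nm = c("Toy West", "Toy East"),
    civi_percent = c(0.4, 1.2),
    row.names = c("a", "b"),
    stringsAsFactors = FALSE
  )
)

pal_toy <- colorNumeric(palette = "Blues", domain = toy_regions@data$civi_percent)

m_toy <- leaflet(toy_regions) %>%
  addPolygons(
    stroke = FALSE,
    fillOpacity = 1,
    fillColor = ~pal_toy(civi_percent),
    # Hover label showing the area name and its value
    label = ~paste0(rgn18nm, ": ", civi_percent, "%")
  ) %>%
  addLegend(
    position = "bottomright",
    pal = pal_toy,
    values = ~civi_percent,
    title = "Civil servants (% of population)",
    labFormat = labelFormat(suffix = "%")
  )
```

Printing m_toy displays the map. On the real data the same addLegend call would be piped onto the map built from uk_regions, using pal and civi_percent.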
#Get world country polygons from the rnaturalearth package
library(rnaturalearth)
my_world <- ne_download(scale = 50, type = 'countries', category = 'cultural')
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\barry\AppData\Local\Temp\RtmpGUmSan", layer: "ne_50m_admin_0_countries"
## with 241 features
## It has 94 fields
## Integer64 fields read as strings: POP_EST NE_ID
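#Remove Antarctica
no_penguins <- my_world[my_world@data$SOVEREIGNT != "Antarctica", ]
#Draw country outlines only: stroke is a logical flag and weight sets the line width
m <- leaflet(data = no_penguins) %>%
  addPolygons(stroke = TRUE, weight = 1, smoothFactor = 0.2, fillOpacity = 0)
m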